ABSTRACT


This thesis aims to improve image throughput from satellite to Earth by using Artificial Vision
to perform data compression before the downlink. Onboard Analysis for Selective Imagery
Compression (OASIC) is a hybrid compression algorithm designed for oceanic imagery,
incorporating both lossless and lossy compression methods to achieve a high compression ratio
with minimal noise on vessels of interest. This is achieved by separating the vessels from
the surrounding ocean and storing them with high fidelity, while compressing the remainder
of the image with low fidelity. The performance of OASIC is examined on full resolution
panchromatic satellite images and compared to both lossless and lossy JPEG2000 compressed
images. In nearly all configurations tested, OASIC outperforms JPEG2000, achieving an
average fifteen-fold improvement in compression ratios while maintaining a nearly lossless
fidelity for the vessels within the OASIC compressed images. This content-sensitive compression
algorithm can potentially enable the transmission of higher spatial resolution images, with more
spectral bands, and at higher download speeds from space.

CHAPTER 1:
Introduction 


In this chapter, the motivation for the development of what is called the OASIC algorithm 
(pronounced "oasis") and the problem it aims to solve are discussed. The goals of this research 
are to apply artificial vision to digital imagery compression and to compare its performance to 
conventional image compression methods. 

1.1 Background 

When speaking over a radio, it is considered good practice to keep a report brief and to the
point, avoiding any unnecessary transmission, so that one does not inadvertently tie up the
scarce resources of the radio network. Conservation of channel capacity, as a critical resource,
is mandatory for satellite communications to the Earth due to both the limited transmit power
of the satellite and the increasingly crowded spectrum used by satellites in space. An
additional hurdle is that the satellite may be operating in a contested environment where
capacity is severely reduced.

The Onboard Analysis for Selective Imagery Compression algorithm (OASIC) aims to conserve
satellite channel capacity when transmitting oceanic imagery to Earth. OASIC improves data
compression by assuming that the only objects within the image that require high fidelity are
ships. Through the use of artificial vision, OASIC attempts to classify every pixel within an
image as either ship or other, less important content such as waves, visible seabed, clouds and
similar phenomena.

1.2 Satellite Imagery 

The concept of acquiring imagery from above dates back to antiquity where scouts would climb 
the high peaks overlooking a rival camp to gather intelligence or climb a tree to help navigate 
through rough terrain. 

As technology improved, so too did the altitude of the observer. From hot air balloons to
hydrogen filled dirigibles to high altitude aircraft such as the U-2 and finally to orbiting
satellites, the quest to see more has driven the observer from the atmosphere and into orbit.
Satellite-borne observation has its roots in the late 1950s era Corona program developed by the
United States, which used analog film cameras and airdropped canisters to return imagery to Earth.
The transition away from film to radio signals and eventually to digital transmissions cemented
the imaging satellite’s presence in outer space.

Advancements in optical detectors allow for ever higher image resolutions and the detected 
spectra can now span from far infrared to ultraviolet, with multiple polarizations. With more 
resolution and spectra, however, more channel capacity is required to send the information to 
the Earth. 

1.3 Downlink Limitations 

Transmissions from a satellite to the surface of the Earth are referred to as downlink. 

The first limitation the downlink faces is power. Imaging satellites are typically solar powered,
and require ever larger and more elaborate solar arrays to generate sufficient power to keep up
with the demands of their various powered systems, including the transmitter. Highly successful
commercial imaging satellites such as WorldView-2 require a large 3.2 kW solar array to
provide enough power to operate.

The second limitation is signal noise. Earth, the location of the receiver, is an
electromagnetically noisy environment, and the satellite itself must contend with its own
internal electronic noise as well as signal distortions induced by natural radiation in space.

Satellites are also restricted by mass and physical dimensions, limiting the transmitting antenna
dish area and necessitating ever more creative methods of collapsible antennas to push the
envelope. WorldView-2 weighs 3.2 tons, with much of that mass dedicated to power.

The transmission carrier-to-noise ratio ($C/N_0$) is defined in Equation 1.1 and computed from
the gain of the transmitter dish $A_t$, its power $P_t$, and the gain of the receiver dish $A_r$.
$k$ is Boltzmann’s constant, $T_s$ is the system noise temperature (in Kelvin), and $L_p$ and
$L_a$ are the free space and atmospheric losses, respectively.

$$\frac{C}{N_0} = \frac{A_t P_t (L_p L_a) A_r}{k T_s} \qquad (1.1)$$

Channel capacity is the rate at which bits can be propagated through the range of frequencies
used by the satellite and is bounded by Shannon’s Limit, shown in Equation 1.2. Channel
capacity $I$ is measured in bits per second (bps) and increases with both the frequency
bandwidth $B$ in Hertz and the carrier-to-noise ratio calculated above.

$$I \le B \log_2\left(1 + \frac{C}{N}\right) \qquad (1.2)$$
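
As a rough illustration of Equation 1.2, the following Python sketch evaluates the Shannon bound
for a purely hypothetical link. The 50 MHz bandwidth and the linear carrier-to-noise ratio of 15
are assumptions chosen to keep the arithmetic simple; they are not figures for any satellite
discussed in this thesis, and the function name is illustrative only.

```python
import math

def shannon_capacity_bps(bandwidth_hz: float, cnr_linear: float) -> float:
    """Upper bound on channel capacity in bits per second (Equation 1.2)."""
    return bandwidth_hz * math.log2(1.0 + cnr_linear)

# Hypothetical link values, chosen only to make the arithmetic easy to follow.
capacity = shannon_capacity_bps(bandwidth_hz=50e6, cnr_linear=15.0)
print(f"{capacity / 1e6:.0f} Mbps")

# Doubling the bandwidth doubles the bound; doubling the ratio adds far less.
print(f"{shannon_capacity_bps(100e6, 15.0) / 1e6:.0f} Mbps")
print(f"{shannon_capacity_bps(50e6, 30.0) / 1e6:.0f} Mbps")
```

Because the ratio sits inside the logarithm, extra transmit power buys capacity far more slowly
than extra bandwidth does, which is one reason better compression is attractive.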

A typical imagery satellite such as DigitalGlobe’s WorldView-2 orbits at an altitude of
770 km, in what is known as Low Earth Orbit (LEO). LEO offers the closest view of the Earth,
improving image resolution but limiting the time the satellite is able to downlink its images to
any particular ground station. The orbital period for LEO (the time it takes to complete a single
orbit) is measured in minutes (100 minutes for WorldView-2, for instance), with a receiving
station only in view for a small fraction of that time. Time is the final limitation, and can be
mitigated by the addition of more ground stations, increased channel capacity or relaying the
transmission through other satellites.

A satellite such as WorldView-2 captures up to 331 Gbits of imagery on a single orbit, but
requires an 800 Mbps downlink throughput to transmit the data to Earth. Any data that cannot be
downlinked may have to be held in finite onboard storage and wait, up to an hour, for the
downlink to resume. These limitations only grow more pronounced as technology continues
to improve and satellites demand more channel capacity than the solar arrays, antennas and
low-noise amplifiers can provide.
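
The scale of the problem can be sketched with the two figures quoted above. The Python snippet
below assumes a sustained 800 Mbps rate with no protocol overhead, and the fifteen-fold
reduction in data volume at the end is a hypothetical value used only to show the proportional
effect of compression.

```python
def downlink_seconds(data_bits: float, rate_bps: float) -> float:
    """Time needed to transmit data_bits at a sustained rate of rate_bps."""
    return data_bits / rate_bps

imagery_bits = 331e9   # up to 331 Gbits captured per orbit (from the text)
rate_bps = 800e6       # 800 Mbps downlink throughput (from the text)

seconds = downlink_seconds(imagery_bits, rate_bps)
print(f"{seconds:.2f} s, or about {seconds / 60:.1f} minutes of contact time")

# Hypothetical: a 15x reduction in data volume shortens the contact time by 15x.
print(f"{downlink_seconds(imagery_bits / 15, rate_bps):.1f} s after compression")
```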

One promising solution is to improve data compression and use the existing channel capacity 
more efficiently. 

1.4 Data Compression 

Data compression is the practice of representing a data set with fewer bits than its original
representation uses. Lossless data compression reduces the number of bits needed to represent
data by taking advantage of statistical redundancy within the source data. When lossless
compression is used, the original data is reconstituted entirely, with no errors.
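
A minimal sketch of the lossless property, using the DEFLATE coder in Python's standard `zlib`
module as a stand-in (it is not the coder used by OASIC). The repetitive byte string is a
hypothetical sample standing in for statistically redundant image data.

```python
import zlib

# A highly redundant byte string stands in for smooth image data (hypothetical sample).
original = bytes([10, 10, 10, 12, 12, 12, 12, 200]) * 1000

compressed = zlib.compress(original, 9)   # 9 = strongest compression level
restored = zlib.decompress(compressed)

print(len(original), "bytes ->", len(compressed), "bytes")
print("bit-exact round trip:", restored == original)
```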

Lossy data compression, however, takes advantage of the relative importance of some data over
other data and aims to quantize or remove the less important data. For the popular lossy music
compression standard MPEG Layer 3 (MP3), the audio frequencies and tonal components of
the audio outside normal human perception are removed or reduced, leading to tremendous
compression efficiency with little to no perceived loss of quality.

JPEG, the de-facto graphical image standard used by the World Wide Web, similarly takes 
advantage of the limits of human perception by reducing the fidelity of the color space while 
preserving the luminosity. 

OASIC is a lossy image compression algorithm that aims to preserve the quality of the vessels
while sacrificing everything else. It is also intended to ultimately be implemented aboard
imaging satellites, and be able to operate within the memory and processor constraints dictated
by their architecture.

1.5 Research Goals 

The purpose of this research is to validate the concept of Content-Aware Adaptive Compression 
of Satellite Imagery Using Artificial Vision. The OASIC algorithm is used to compress and 
uncompress actual satellite images in order to analyze the compression performance and fidelity 
losses. This research aims to show that OASIC not only compresses oceanic satellite images 
better than contemporary compression techniques such as JPEG2000, but also does so with less 
degradation to the vessels within the images.

CHAPTER 2:
Related Work 


In this chapter, the methods of feature extraction, classification, and compression are discussed.
The goal of OASIC is to reduce the amount of channel capacity consumed by an orbiting
imaging satellite when transmitting captured images to the surface. Artificial vision based ship
detection algorithms are well researched, and there are several examples that share similarities
to the OASIC algorithm. The area of digital data compression, especially image compression,
is also well researched. The OASIC algorithm incorporates these two distinct topics.

2.1 Low Level Feature Extraction 

Low level features are the smallest units of information of an image that are read directly from 
the digital medium. 

2.1.1 Discrete Wavelet Transform 

In the area of Computer Vision, there are many proven methods of low level feature extraction.
OASIC uses the Discrete Wavelet Transform (DWT) because, according to Meyer [1], it takes
advantage of the relatively low energy of ocean texture compared to ship texture in the frequency
domain yielded by the wavelet transform. The DWT is also adept at extracting desired objects
from images saturated with noise, as described by Casasent [2].

The wavelet decomposition as described by Antonini [3] acts as a two-dimensional digital
high-pass filter, removing all of the subtle changes in pixel intensity associated with ocean
wave tops. This leaves only the features that abruptly differ from their neighboring environment.
In effect, wavelet decomposition suppresses much of the natural ocean, while expressing the
objects on the surface.

As described by Tello [4] and Selvi [5], three of the four sub-band products of the DWT (HH,
HL and LH) can be used to localize, down to a pixel, the existence of an object within a noisy
background. According to both Tello [4] and Strickland [6], the Discrete Wavelet Transform is
well suited for detecting edges in a noisy image because it natively suppresses noise. However,
edges may not stand out against the noise at all resolutions. Multiple recursive wavelet
decompositions may therefore be required to detect a wide range of object sizes, forming a
pyramid as described by Bogush [7].

Huang [8] suggests that the optimal number of wavelet pyramid octaves is three to four.
Experiments with OASIC detection and classification have determined that three octaves is the
optimal number, agreeing with Huang’s research. Kiely [9] and Zhu [10] use this number of
pyramid octaves as well.

Tello’s use of wavelet decomposition differs from OASIC’s feature extraction in that it applies 
the correlation of a 4th sub-band (LL). In the case of OASIC, this sub-band is still decomposed 
to form the next octave of the pyramid, but is not directly supplied to the classifier for analysis. 

In the case of Tello, Corbane, Fang [11] and Huang [8], their papers include some type of
de-noising stage before or after feature extraction. This step is absent in OASIC, as it expects
the relatively low energy noise common in optical imagery rather than the much noisier SAR
images cited in their work. OASIC also benefits from the inherent de-noising qualities of the DWT.

Experimental findings agree with the research of Tello et al. in that the DWT handles ocean
waves very well. Because OASIC is designed for optical and not SAR imagery, clouds are
a concern while radar associated clutter is not. OASIC makes no attempt at masking clouds,
however, and relies on the versatility of the DWT to spot vessels through partial cloud cover
and ignore large clouds with gradual changes in pixel intensity.

2.2 Feature Classification 

Once low level feature extraction has been performed, OASIC combines the outputs of the DWT 
into an input vector which is fed to a Support Vector Machine for training and classification. 

2.2.1 Support Vector Machine 

Other ship detection methods, such as Fang [11], have also combined feature extraction methods
and learning algorithms for detection and compression purposes similar to those of OASIC. Their
research differs in that their learning algorithm is a neural network, and compression is
performed by vector quantization. Thus, they do not explore the DWT, SVM and compression
algorithm combination that OASIC employs.

The work of Zhu [10] is similar to OASIC in that they use the DWT for feature extraction 
with the optimal three-octave pyramid and also use an SVM for classification. OASIC differs 
significantly, however, in that it performs no additional filtering of the DWT products before the 
classification stage, and accepts a certain number of false positives as inevitable. OASIC also 
makes no attempt to identify what kind of vessel the object is, its course, or its speed.

Work by Mattyus [12] bears similarities to OASIC as they, too, use the DWT for low level
feature extraction and a learning algorithm to perform the classification step. Rather than using
water-masking to eliminate surface features from consideration, OASIC uses terrain-masking,
which is functionally the same method. However, their use of the DWT outputs differs in that the
coefficient sub-bands are not directly used by a classifier. Instead, their learning algorithm
relies on derived Haar-like features and AdaBoost to form their classifier. Their detector
performs multiple passes at different rotations, whereas this step is not needed for OASIC.

Rainey [13] uses a similar combination of feature extraction via wavelets and multiple types of 
strong and weak classifiers including SVM. OASIC differs in that it solely relies on the DWT 
for feature extraction and SVM for classification with the goal of facilitating better compression 
performance. Although not used for ship detection, the methodology of Schneiderman [14] is 
similar in that the DWT is used for feature extraction and the resultant coefficients are fed into 
an SVM, albeit with additional processing. 

The work of Corbane [15] [16] [17] describes the use of DWT for feature extraction, and also
discusses the merits of separating large images into more manageable chunks of equal size called
tiles. OASIC also uses tiles in the same way, performing the DWT to extract low level features
from a single tile, then performing the classification on those features via SVM.

As stated by Degirmenci [18], SVMs can be relied upon to provide excellent classification but 
care must be taken to select good features for training and classification as SVMs tend to be 
processing intensive otherwise. 

The efforts of Corbane, Mattyus [12], Zhu [10], and Rainey [13] describe a similar method 
in their works, but differ from OASIC in that their goal is ship detection. OASIC uses ship 
detection only for the purposes of compression. The general shape of the area encompassing 
the detected object is not important, and the number of false positives is not as critical to OASIC 
for this reason. 

2.3 Compression 

The overarching purpose of OASIC is to reduce the amount of channel capacity needed to
downlink a satellite image while retaining high fidelity for ships within an image. To accomplish
this, OASIC uses artificial vision to separate the vessels within an image from everything else,
producing two layers. The first layer, the foreground, contains the detected ships. The second
layer, the background, contains everything else including the ocean, clouds and any terrain that
has not been removed via terrain mask. Both foreground and background layers are then compressed
through conventional means at different fidelity settings.

Work for this purpose is similar to Marcia [19] in that it allows for a high resolution image
with pockets of high fidelity to be reconstructed from a sparse dataset. The implementation,
however, differs from OASIC in that they do not use detection or artificial vision to define areas
of an image that are to be compressed with a higher fidelity as OASIC does.

2.3.1 Lossless and Lossy Image Compression 

A lossless image is one that contains exactly the same pixels before compression and after
decompression. Natural images often contain a chaotic element that is difficult to compress
losslessly while still achieving a reduction in size. Lossless compression is nevertheless
invaluable in applications where the pixel values themselves are used to glean additional
intelligence from an image, such as reading aircraft markings from a wing of a jet on an aircraft
carrier. Such fine details may be obliterated by lossy compression.

In the early 1990s, driven by the emergence of the Internet and the demand for multimedia over
a bandwidth-limited dial-up connection, lossy compression became popular in the form of the
JPEG standard. Lossy images are compressed images that sacrifice fidelity, often in subtle or
imperceptible ways, to create a smaller file than can be achieved with lossless compression alone.
Very high compression ratios can therefore be achieved at the cost of fidelity. OASIC uses both
of these types of compression: lossless on the foreground, and lossy on the background.

2.3.2 JPEG2000 

JPEG2000 is a relatively new compression standard that can compress images in both lossy
and lossless modes. This algorithm offers excellent compression performance with a variable
level of quality for its lossy mode, making it ideal for use in OASIC. The JPEG2000 compression
standard, as defined by Skodras [20], is used to compress both foreground and background image
layers.

He et al. [21] describe the process by which the JPEG2000 algorithm continues to divide an area 
of an image using a quadtree via successive wavelet decompositions during lossy compression. 
To minimize the file size of a lossy JPEG2000 image used by OASIC’s background, OASIC 
suppresses detected objects within the image so that it contains only low-frequency ocean pixels. 
This step prevents the need for additional wavelet decomposition thereby reducing the file size 
and improving its compression efficiency.

2.3.3 Selective Compression

The method of selective high-fidelity compression described by Mekisso [22] differs in that the
coordinates of a bounding box that define the area of high fidelity are provided to the encoding
function. OASIC aims to determine the number, size and position of bounding boxes itself using
artificial vision. Furthermore, the selective compression performed by OASIC is performed on
two images that have been segmented from the same source with different fidelity settings and
two different compression methods.

Compression of a composite image of two different layers has been performed by Kiely [9],
who used a lossless (JPEG-LS) compression paired with a lossy (JPEG-2000) to obtain similar
results, validating the method.

OASIC makes use of efficient packing of rectangles, implementing a derivative of Korf [23]
to pack detected objects in preparation for lossless compression. OASIC assumes an optimal
rectangle’s horizontal width is a multiple of 16 to facilitate the most efficient compression.

Compressing images tile-wise is discussed by Fowler [24]. While OASIC does not compress
in this manner, it performs the DWT based feature extraction and classification tile-wise at an
optimal tile size of 512 x 512. This method is discussed in greater detail in Chapter 3.

Similar to work demonstrated by Xing [25], OASIC can also compress irregularly shaped
objects, though this is done by simply enclosing the irregular shape in a rectangle and setting
all non-object pixels to a designated transparent pixel value (defaulted to black), or using an
alpha channel if all 256 possible pixel values are already present in the shape to be compressed.
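
The enclosing-rectangle idea can be sketched as follows. This is a minimal NumPy illustration
under stated assumptions, not the OASIC implementation: the function name `crop_object`, the
boolean mask input and the 6 x 6 sample are hypothetical, and the sketch only reports when an
alpha channel would be needed rather than building one.

```python
import numpy as np

def crop_object(image: np.ndarray, mask: np.ndarray, transparent: int = 0):
    """Enclose the masked object in its bounding rectangle.

    Returns the rectangle with non-object pixels set to `transparent`, plus a
    flag saying whether an alpha channel would be needed instead because the
    object already uses all 256 grey levels.
    """
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    r0, r1 = rows[0], rows[-1] + 1
    c0, c1 = cols[0], cols[-1] + 1

    patch = image[r0:r1, c0:c1].copy()
    patch_mask = mask[r0:r1, c0:c1]
    needs_alpha = np.unique(patch[patch_mask]).size == 256
    patch[~patch_mask] = transparent
    return patch, needs_alpha

# Hypothetical 6 x 6 image containing an L-shaped "vessel".
image = np.arange(36, dtype=np.uint8).reshape(6, 6) + 100
mask = np.zeros((6, 6), dtype=bool)
mask[1:4, 2] = True
mask[3, 2:5] = True

patch, needs_alpha = crop_object(image, mask)
print(patch)
print("alpha channel needed:", bool(needs_alpha))
```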
CHAPTER 3:
Methodology 


In this chapter, the methods used to implement the preparation, feature extraction, training,
classification, and compression are discussed. The basic operation of the OASIC algorithm can
be broken up into two major parts as shown in Figure 3.1. The Detection Stage is further broken
up into Feature Extraction and Classification.


[Figure 3.1 diagram: Raw Image -> OASIC Algorithm (Detection Stage, Compression Stage) ->
Compressed Image File]

Figure 3.1: Simple representation of the two major stages of OASIC: Detection and Compression.


3.1 Image Preparation 

OASIC expects 8-bit per channel panchromatic (grayscale) images. It does, however, support
color images, though they are converted to panchromatic and downsampled automatically
before testing. The 8-bit, panchromatic image limitation is imposed in order to determine
OASIC’s performance when compressing one of the more limited forms of commonly used
satellite imagery.

3.1.1 Terrain Masking 

The goal of OASIC is to preserve the vessels at sea within an image. It is therefore
advantageous to remove any terrain from an image, both to prevent it from consuming precious
channel capacity and to prevent the classifier from erroneously detecting ships ashore.

To address this issue, all land terrain is replaced with black, and the bordering ocean texture
is faded into the newly erased areas. In this way, the DWT does not produce lines of high
energy coefficients at the interface between the blacked out shores and the ocean, which may be
mistaken for lines of vessels by the classifier.

For the following experiments in Chapter 4, this step is applied manually. However, assuming
the satellite’s position and camera angle are precisely known, an existing vector based nautical
chart can be converted into a mask and used to remove the terrain in order to automate this step.



Figure 3.2: An example of a vector-based Digital Nautical Chart (DNC) which could be used to 
automatically remove most of the terrain from a satellite image. 


3.1.2 Converting to Panchromatic 

Images may be color but must be converted to an 8-bit, single channel panchromatic image. Early
experimentation indicates there is no difference in performance when using color images that are
converted to panchromatic compared to images that are natively panchromatic. It is suspected,
though untested, that IR or hyper-spectral images would work as well.



Figure 3.3: The original unprepared image (left) is converted to single channel panchromatic and the 
terrain is removed (right).

3.1.3 Bit Depth Scaling

The OASIC algorithm is designed to compress 8-bit images only. Any images with a larger bit 
depth are scaled to 8-bit before they are processed. 
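
The text does not say how the scaling is carried out. One simple possibility, shown here purely
as an assumption, is to discard the low-order bits with a right shift so that the full source
range maps onto 0..255. The function name `to_8bit` and the 11-bit sample values are hypothetical.

```python
import numpy as np

def to_8bit(image: np.ndarray, source_bits: int) -> np.ndarray:
    """Scale an unsigned integer image of `source_bits` depth down to 8 bits."""
    if source_bits <= 8:
        return image.astype(np.uint8)
    # Dropping the low-order bits maps the full source range onto 0..255.
    return (image >> (source_bits - 8)).astype(np.uint8)

# Hypothetical 11-bit sensor values (0..2047).
raw = np.array([[0, 1023, 2047]], dtype=np.uint16)
print(to_8bit(raw, source_bits=11))
```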

3.2 Feature Extraction 

Feature extraction is a necessary step in preparing the image for training and prediction by a 
classifier. Simply loading the raw pixel data into a classifier algorithm is seldom effective or 
efficient. Instead, feature extraction aims to obtain information about not only the pixel, but 
the pixel’s interactions with its surroundings that differentiate the objects within the image. 
The classifier uses these differentiating features to attempt to separate the objects from their 
background. The features may constitute a smaller set of data than the raw pixel data, but this 
is not always the case: OASIC’s feature dataset is often many times larger. 

3.2.1 Tiling 

A tile is a smaller subset of the larger image. OASIC examines the given image one tile at a 
time beginning at the upper left corner of the image and ending at the lower right corner. Each 
tile is square, comprising 512 x 512 pixels. If the image dimensions are not multiples of 512, 
OASIC automatically pads the image accordingly with copies of adjacent pixels. 
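
The tiling step can be sketched with NumPy as below. The sketch assumes the padding is added on
the bottom and right edges (the text does not say which edges are padded) and uses `numpy.pad`
with `mode="edge"` to replicate adjacent pixels; the function names and the 700 x 1100 sample
image are hypothetical.

```python
import numpy as np

TILE = 512  # tile edge length in pixels, as stated in the text

def pad_to_tiles(image: np.ndarray, tile: int = TILE) -> np.ndarray:
    """Pad the bottom and right edges with copies of the adjacent pixels."""
    pad_rows = (-image.shape[0]) % tile
    pad_cols = (-image.shape[1]) % tile
    return np.pad(image, ((0, pad_rows), (0, pad_cols)), mode="edge")

def iter_tiles(image: np.ndarray, tile: int = TILE):
    """Yield (row, col, tile_array) from the upper left to the lower right."""
    padded = pad_to_tiles(image, tile)
    for r in range(0, padded.shape[0], tile):
        for c in range(0, padded.shape[1], tile):
            yield r, c, padded[r:r + tile, c:c + tile]

# Hypothetical 700 x 1100 panchromatic image.
image = np.zeros((700, 1100), dtype=np.uint8)
tiles = list(iter_tiles(image))
print(len(tiles), "tiles of shape", tiles[0][2].shape)
```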

3.2.2 Discrete Wavelet Transform 

OASIC uses the DWT to extract the necessary features from each 512 x 512 tile. Each wavelet
decomposition produces four coefficient matrices called sub-bands. The DWT was chosen for
feature extraction for its native ability to separate low frequency waves from high frequency
waves such as the edges separating ocean from ship. The DWT is defined in Equation (3.1),
where $Wf$ is the resultant coefficient matrix of the input image $f$ and the mother wavelet
function $\psi(x, y)$. The parameters are $s$ for scale and $t = (t_x, t_y)$ for translation.
Equation (3.2) defines the scaled and translated mother wavelet function. The algorithm used for
applying the DWT to a two dimensional image or matrix is described by Mallat [26].

$$Wf(s, t_x, t_y) = \left\langle f(x, y),\ \psi_{s,t}(x, y) \right\rangle \qquad (3.1)$$

$$\psi_{s,t}(x, y) = \frac{1}{s}\,\psi\!\left(\frac{x - t_x}{s},\ \frac{y - t_y}{s}\right) \qquad (3.2)$$

[Figure 3.4 diagram: an image passed through the DWT, yielding the LL, HL, LH and HH sub-bands]

Figure 3.4: An image is decomposed via DWT to produce four sub-bands. Note: LL is a half-scale
version of the original.


3.2.3 Sub-bands 

Each of the four sub-bands is unique: LL is the low-pass matrix of the source tile, HL is the
horizontal coefficient matrix, LH is the vertical coefficient matrix and HH is the diagonal
(upper-left to lower-right) coefficient matrix. Each sub-band is half the dimensions of the source
tile along both the x and y axes. Therefore, after a single decomposition, all sub-band matrices
contain 256 x 256 coefficients.
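
A single decomposition can be illustrated with the Haar wavelet, which is used here only as an
assumption because it is the simplest filter pair; this section does not name the wavelet OASIC
uses. The sketch follows the labeling given above, with HL responding to left/right differences
and LH to up/down differences. The function name and the synthetic tile are hypothetical.

```python
import numpy as np

def haar_dwt2(tile: np.ndarray):
    """One level of a 2-D Haar DWT; returns the LL, HL, LH and HH sub-bands."""
    t = tile.astype(np.float64)
    a = t[0::2, 0::2]  # upper-left pixel of every 2 x 2 block
    b = t[0::2, 1::2]  # upper-right
    c = t[1::2, 0::2]  # lower-left
    d = t[1::2, 1::2]  # lower-right
    ll = (a + b + c + d) / 2.0  # low-pass in both directions
    hl = (a - b + c - d) / 2.0  # difference between left and right neighbours
    lh = (a + b - c - d) / 2.0  # difference between upper and lower neighbours
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, hl, lh, hh

# Hypothetical 512 x 512 tile: flat "ocean" with a vertical edge at column 101.
tile = np.zeros((512, 512))
tile[:, 101:] = 100.0

ll, hl, lh, hh = haar_dwt2(tile)
print(ll.shape)  # each sub-band is half the tile size along each axis
print(np.abs(hl).max(), np.abs(lh).max(), np.abs(hh).max())
```

In this synthetic tile only the HL sub-band responds, because the intensity changes from left to
right and is constant from top to bottom.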


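$$f^{LL}_{(g-1)}(i, j) = \sum_{k_1=0}^{L-1} \left( \sum_{k_2=0}^{L-1} f^{LL}_{(g)}(2i + k_1,\ 2j + k_2) \cdot l_{k_2} \right) \cdot l_{k_1} \qquad (3.3)$$

$$f^{HL}_{(g-1)}(i, j) = \sum_{k_1=0}^{L-1} \left( \sum_{k_2=0}^{L-1} f^{LL}_{(g)}(2i + k_1,\ 2j + k_2) \cdot h_{k_2} \right) \cdot l_{k_1} \qquad (3.4)$$

$$f^{LH}_{(g-1)}(i, j) = \sum_{k_1=0}^{L-1} \left( \sum_{k_2=0}^{L-1} f^{LL}_{(g)}(2i + k_1,\ 2j + k_2) \cdot l_{k_2} \right) \cdot h_{k_1} \qquad (3.5)$$

$$f^{HH}_{(g-1)}(i, j) = \sum_{k_1=0}^{L-1} \left( \sum_{k_2=0}^{L-1} f^{LL}_{(g)}(2i + k_1,\ 2j + k_2) \cdot h_{k_2} \right) \cdot h_{k_1} \qquad (3.6)$$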


The computation of the four DWT sub-bands (LL, HL, LH and HH) is described by Equations
(3.3), (3.4), (3.5) and (3.6), respectively, where $f^{Z}_{(g)}$ represents the coefficients for
sub-band Z with resolution of g, according to Bogush [7]. $l_{k_1}$, $l_{k_2}$, $h_{k_1}$ and
$h_{k_2}$ are the low-pass filter and high-pass filter coefficients, respectively. L is the
horizontal and vertical dimension of the matrix the DWT is applied to.

Because the DWT is calculated by examining not only a pixel’s intensity but also that of its
neighbors, the results contain a spatial data component organized by the three detail sub-bands.
The HL (horizontal) sub-band responds more strongly to intensity gradients between a pixel and
the pixel to its right, the LH (vertical) sub-band responds more strongly to gradients between a
pixel and the one below it, and the HH (diagonal) sub-band responds more strongly to gradients
toward the lower right.


3.2.4 Pyramid 

After decomposing a tile into its component sub-bands, the LL sub-band can be further
decomposed into yet another four coefficient sub-bands, divided again by 2 along both axes,
yielding a new octave of sub-bands containing 128 x 128 coefficients. This process can be
repeated, adding further octaves of four sub-bands to the pyramid until the sub-bands are 1 x 1.
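
The recursion can be sketched as follows, again assuming a Haar wavelet purely for illustration.
The function names, the dictionary layout of each octave and the random stand-in tile are
hypothetical and are not taken from the OASIC implementation.

```python
import numpy as np

def haar_dwt2(x: np.ndarray):
    """One level of a 2-D Haar DWT (assumed wavelet); returns LL, HL, LH, HH."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    return ((a + b + c + d) / 2, (a - b + c - d) / 2,
            (a + b - c - d) / 2, (a - b - c + d) / 2)

def build_pyramid(tile: np.ndarray, octaves: int):
    """Decompose the LL sub-band repeatedly; return the detail bands per octave."""
    pyramid = []
    ll = tile.astype(np.float64)
    for _ in range(octaves):
        ll, hl, lh, hh = haar_dwt2(ll)
        pyramid.append({"HL": hl, "LH": lh, "HH": hh})
    return pyramid

rng = np.random.default_rng(0)
tile = rng.integers(0, 256, size=(512, 512))  # hypothetical stand-in tile

for n, octave in enumerate(build_pyramid(tile, octaves=3), start=1):
    print(f"octave {n}: {octave['HL'].shape}")

# A 512 x 512 tile supports at most log2(512) = 9 octaves before reaching 1 x 1.
full = build_pyramid(tile, octaves=9)
print(len(full), full[-1]["HH"].shape)
```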



Figure 3.5: A pyramid with 9 octaves, each containing 4 wavelet sub-bands.


The pyramid height, or highest number of octaves to be added to the pyramid, is configurable.
Adding octaves to the pyramid generally results in more detections, but requires more
computation time and memory. Furthermore, too many octaves within the pyramid will cause too many
non-ship pixels to be detected.

Because each octave is repeatedly divided, each successive octave contains only one quarter
the coefficients of its precursor. To properly calculate a feature vector that samples the
appropriate DWT coefficients from each pyramid octave, multiple coordinate transforms must be
performed for every pixel of data within the source image. While mathematically
straightforward, performing the transform can be prohibitively slow as there are often millions
of pixels in the satellite image, with several octaves, each with three sub-bands to calculate
per pixel.

OASIC uses a shortcut that yields exactly the same results yet performs far faster. The shortcut
is to scale each octave to match the size of the largest octave at 256 x 256. Once all octaves are
the same size, they can be combined into a three dimensional matrix, and the feature vectors can
be used to create a larger wavelet pyramid vector with no additional floating point operations,
as shown in Figure 3.6. The speed boost comes at the cost of memory, as each scaled pyramid
octave consumes $2^{2n}$ times as much memory, where n is the octave.
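
A minimal sketch of the scaling shortcut, using nearest-neighbor replication via `numpy.repeat`
and stacking the scaled octaves along a third axis. The function names and the constant-valued
sample sub-bands are hypothetical; each sample octave is filled with its own octave number so the
per-location lookup is easy to follow.

```python
import numpy as np

def upscale_nearest(band: np.ndarray, size: int) -> np.ndarray:
    """Nearest-neighbour upscaling by an integer factor along both axes."""
    factor = size // band.shape[0]
    return np.repeat(np.repeat(band, factor, axis=0), factor, axis=1)

def stack_octaves(bands, size: int = 256) -> np.ndarray:
    """Scale every octave to size x size and stack them along a third axis."""
    return np.stack([upscale_nearest(b, size) for b in bands], axis=-1)

# Hypothetical HL sub-bands for three octaves (256, 128 and 64 coefficients wide),
# each filled with its octave number so the sampling is easy to check.
bands = [np.full((256 >> n, 256 >> n), n + 1.0) for n in range(3)]

cube = stack_octaves(bands)
print(cube.shape)
print(cube[200, 37])  # one coefficient per octave for this location
```

After stacking, retrieving the coefficients of every octave for one location is a single index
operation, with no per-pixel coordinate transform.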

[Figure 3.6 diagram: a sampled pixel aligned across the 1st through 5th octaves, feeding the
Wavelet Pyramid Vector]

Figure 3.6: An example of how a single coefficient of a wavelet sub-band is aligned to its four higher
octaves and the appropriate coefficients are retrieved and combined into a feature vector.

When scaling wavelet pyramid octaves, the scaled image may be interpolated using nearest
neighbor with no distortion, as the octaves are always interpolated by integer factors. However,
OASIC also offers the ability to interpolate the pyramid octaves using bilinear, trilinear or
bicubic filters. True positive (TP) pixels are ship pixels correctly identified as such, while
false positive (FP) pixels are non-ship pixels erroneously identified as ship pixels. Using these
interpolation methods improves the TP pixel detection rate significantly over nearest neighbor
while raising the FP pixel rate by a much smaller amount. The results of using bicubic
interpolation are shown in Chapter 5.



Figure 3.7: The 3rd octave is scaled by nearest neighbor (top) or bicubic filter (bottom).


3.3 Classification 

The heart of OASIC is its ability to properly identify each pixel of an image as belonging to 
either a ship or the ocean. Unlike many ocean vessel detection schemes, OASIC makes no 
attempt to recover any additional information about the ship such as its speed, course, type, or 
identity. The goal of OASIC’s classification is to determine the vessel’s existence and location 
within the image for the purpose of selective compression only. 

Therefore, OASIC is tolerant of much higher false positive pixel rates than other detectors. The
emphasis is on maximizing true positive pixels at the expense of admitting false positive pixels
(ocean features erroneously detected as ships, such as wave crests).
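
The trade-off can be made concrete with pixel-level rates. The sketch below assumes the rates are
normalized by the number of true ship pixels and true non-ship pixels respectively, which the text
does not state; the function name and the 4 x 4 label matrices are hypothetical.

```python
import numpy as np

def pixel_rates(predicted: np.ndarray, truth: np.ndarray):
    """Return (true positive rate, false positive rate) over all pixels."""
    tp = np.count_nonzero(predicted & truth)     # ship pixels labelled ship
    fp = np.count_nonzero(predicted & ~truth)    # ocean pixels labelled ship
    return tp / np.count_nonzero(truth), fp / np.count_nonzero(~truth)

# Hypothetical 4 x 4 label matrices (True = ship pixel).
truth = np.zeros((4, 4), dtype=bool)
truth[1:3, 1:3] = True                   # a 2 x 2 vessel
predicted = np.zeros((4, 4), dtype=bool)
predicted[1:3, 1:4] = True               # vessel found, plus two extra pixels

tpr, fpr = pixel_rates(predicted, truth)
print(f"TP rate {tpr:.2f}, FP rate {fpr:.3f}")
```

Here every ship pixel is kept at the cost of two ocean pixels being stored at high fidelity,
which is the kind of outcome OASIC is designed to accept.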


3.3.1 Support Vector Machine 

The SVM was chosen as OASIC’s classifier due to its excellent operating characteristics when 
training and predicting between only two labels: ships and ocean. 

Inputs to the SVM are provided by the wavelet pyramid and its octaves. The coefficients of
multiple octaves spanning the pyramid are retrieved and combined into the Wavelet
Pyramid Vector for each pixel within the image. The LH, HL and HH sub-band values from
each octave are combined in the order defined in Equation (3.7), where m is the number of
octaves to be sampled, and Equation (3.8), where S is the Wavelet Pyramid Vector with n pixel
samples. The S vector, once calculated, is passed on to the SVM. Note: The LL sub-band
is not part of the S vector.


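$$aV_n = \langle LH_1, LH_2, \ldots, LH_m \rangle, \quad aH_n = \langle HL_1, HL_2, \ldots, HL_m \rangle, \quad aD_n = \langle HH_1, HH_2, \ldots, HH_m \rangle \tag{3.7}$$

$$S = \langle aV_0, aH_0, aD_0, aV_1, aH_1, aD_1, \ldots, aV_n, aH_n, aD_n \rangle \tag{3.8}$$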
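
The ordering in Equations (3.7) and (3.8) can be sketched in Python as follows. This is an illustration under stated assumptions: each octave is represented as a dictionary of sub-bands already scaled to the tile dimensions, and the names pyramid_vector and pyramid_matrix are hypothetical rather than taken from the OASIC source.

```python
def pyramid_vector(octaves, row, col):
    """Return [LH_1..LH_m, HL_1..HL_m, HH_1..HH_m] for one pixel.

    octaves: list of m dicts with keys "LH", "HL" and "HH", each a 2-D list
    scaled to the full tile size. The LL sub-band is deliberately not used.
    """
    features = []
    for band in ("LH", "HL", "HH"):      # vertical, horizontal, diagonal detail
        for octave in octaves:           # octave 1 .. octave m
            features.append(octave[band][row][col])
    return features


def pyramid_matrix(octaves):
    """Stack the per-pixel vectors in row-major order, one row per pixel sample."""
    rows = len(octaves[0]["LH"])
    cols = len(octaves[0]["LH"][0])
    return [pyramid_vector(octaves, r, c) for r in range(rows) for c in range(cols)]


octaves = [
    {"LH": [[1]], "HL": [[2]], "HH": [[3]]},   # octave 1
    {"LH": [[4]], "HL": [[5]], "HH": [[6]]},   # octave 2
]
vector = pyramid_vector(octaves, 0, 0)
```

Storing one row per pixel is a choice made for this sketch because most SVM libraries accept training samples in that layout; Equation (3.8) expresses the same values as one concatenated vector.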


3.3.2 Training 

OASIC uses a single 512 x 512 pixel representative image for training. This image contains 
clouds, large and small vessels, cloud shadows and some wave crests. Once feature extraction 
has been performed, the SVM trains on this image’s pyramid. Paired with the 512 x 512 pixel 
training image is a matrix of ground truth labels of the same dimensions called an Annotation 
Label Matrix, labeling each individual pixel as either ship or non-ship. 

3.3.3 Prediction 

Each 512 x 512 tile from the source image is supplied to the feature extractor, which performs 
exactly the same processes on this tile as on the training image. Note that due to the 2^n tile 
dimensions, no pyramid octave can ever overlap adjacent tiles, and tiling produces no seams or 
artifacts at the borders between adjacent tiles. 

During prediction, the SVM fills a label matrix for each tile. These are combined to form a 
matrix of predicted labels of the same dimensions as the original image. From this matrix, the 
ships can be extracted from the background in a process called Layer Segmentation. 
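
The tile-by-tile prediction described above can be sketched in Python as follows. The predict_tile callable stands in for the feature extractor and SVM and is a hypothetical placeholder, as is the function name predict_image; the example uses a tiny tile size so that it can be followed by hand.

```python
TILE = 512  # tile edge length used by OASIC


def predict_image(image, predict_tile, tile=TILE):
    """Label an image tile by tile and reassemble one full-size label matrix.

    predict_tile is a stand-in for feature extraction plus SVM prediction.
    It receives one tile (2-D list) and returns labels of the same shape.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    for top in range(0, rows, tile):
        for left in range(0, cols, tile):
            block = [r[left:left + tile] for r in image[top:top + tile]]
            block_labels = predict_tile(block)
            # Copy the tile's labels back to their position in the full matrix.
            for dr, label_row in enumerate(block_labels):
                labels[top + dr][left:left + len(label_row)] = label_row
    return labels


def bright_is_ship(block):
    # Toy predictor for illustration only: anything brighter than 5 is a "ship".
    return [[1 if value > 5 else 0 for value in row] for row in block]


image = [[0, 9, 0, 0],
         [0, 9, 0, 0],
         [0, 0, 0, 7],
         [0, 0, 0, 0]]
labels = predict_image(image, bright_is_ship, tile=2)
```

Because each tile is processed independently, the reassembled matrix is identical to labeling the whole image at once whenever the predictor itself only looks inside a tile.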

3.4 Layer Segmentation 

Once the matrix of predicted labels is calculated, the source image can be segmented into two 
distinct regions: the foreground and background layers. The foreground layer contains all 
detected ship-pixels while the background contains all other pixels. Layer segmentation permits 
selective compression as both layers can be compressed independently. 

3.4.1 Two-Layer Method 

The most straightforward method to take advantage of the two segmented layers is to compress 
the foreground with lossless fidelity, allowing the pure black pixels to serve as transparent 
pixels, or including a 1-bit transparency mask which itself can be efficiently compressed. The 
background is compressed with a low quality lossy compression. The two files are combined in 
the same container file. 

This method makes no attempt to take advantage of the known location of the foreground objects 
within the image. Rather, the Two-Layer Method relies on the foreground compressor to 
efficiently compress the layer by taking advantage of the long runs of zeros present between 
objects in the sparsely populated foreground layer. 
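
A minimal Python sketch of the two-layer split is given below. It assumes that the background layer initially keeps the whole image, since the detected objects are only removed from it later by object suppression, and it uses the hypothetical name split_two_layers.

```python
def split_two_layers(image, labels):
    """Split an image into a foreground layer, a background layer and a 1-bit mask.

    Foreground: ship pixels keep their value, everything else becomes 0.
    Mask: 1 where a ship pixel was detected, so a genuinely black ship pixel
    is not mistaken for a transparent one.
    Background: a full copy of the image (assumption made for this sketch).
    """
    foreground = [[pixel if label else 0 for pixel, label in zip(img_row, lab_row)]
                  for img_row, lab_row in zip(image, labels)]
    mask = [[1 if label else 0 for label in lab_row] for lab_row in labels]
    background = [list(img_row) for img_row in image]
    return foreground, background, mask


image = [[10, 200],
         [30, 0]]
labels = [[0, 1],
          [0, 1]]
foreground, background, mask = split_two_layers(image, labels)
```

In this example the lower-right pixel is a detected ship pixel whose value is 0. Only the mask distinguishes it from the empty foreground, which is the case the 1-bit transparency mask is meant to cover.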




Figure 3.8: A block diagram displaying the Two-Layer Method flow. (Block labels: Original 
Image, Training Image, Training Labels, SVM, Training, Prediction, Labels, Dilated Labels, 
Compressed Image (Objects Suppressed).) 

3.4.2 Bounding Rectangle Method 

The Bounding Rectangle method takes advantage of the separation of foreground objects from 
the background ocean but further breaks down the foreground to eliminate the empty space 
between detected clusters of pixels. The foreground layer is decomposed into rectangles by 
using a quadtree algorithm. 




Figure 3.9: A block diagram displaying the Bounding Rectangle Method flow. (Block labels: 
Original Image, Training Image, Training Labels, SVM, Training, Prediction, Labels, Predicted 
Labels, Dilated Labels, Bounding Rectangles, Compressed Image (Objects Suppressed).) 


3.4.3 Decomposition by Quadtree 

The purpose of the quadtree is to provide a list of coordinates that define axis-aligned bounding 
rectangles that enclose ship clusters. OASIC’s implementation of the quadtree does not create 
a quadtree data structure. The quadtree functions by subdividing an image into four quadrants 
without cutting any objects into pieces. The two axis-aligned dividing lines for the new division 
start at the center and are perpendicular to each other. If the dividing lines fall on a non-zero 
pixel value, two temporary lines are created along the same axis and shift along the dividing 
line’s perpendicular axis in both directions until one of the temporary lines no longer falls on 
a non-zero pixel or reaches the border of the image. The first temporary line to find a row or 
column with no pixels will become the new location for that dividing line. Once both horizontal 
and vertical dividing lines are established, the image is divided into four smaller images and 
each subdivision is recursively subdivided further. Once the temporary lines cannot avoid 
non-zero rows or columns, an image can no longer be divided. The result is that all objects or 
clusters of objects are enclosed by axis-aligned rectangles to the closest extent possible. The 
enclosing axis-aligned rectangles are illustrated in Figure 3.10. 
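
The recursive subdivision can be sketched in Python as shown below. This is a simplified illustration and not the OASIC implementation: it trims each region to the tight bounding box of its non-zero pixels before searching for dividing lines, an assumption added here so that the recursion always terminates, and the name bounding_rectangles is hypothetical.

```python
def bounding_rectangles(labels):
    """Return (top, left, height, width) rectangles enclosing clusters of non-zero labels."""
    rects = []

    def trim(top, left, bottom, right):
        # Shrink the region to the tight bounding box of its non-zero pixels.
        rows = [r for r in range(top, bottom) if any(labels[r][left:right])]
        if not rows:
            return None
        cols = [c for c in range(left, right)
                if any(labels[r][c] for r in range(rows[0], rows[-1] + 1))]
        return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1

    def nearest_empty(start, stop, is_empty):
        # Search outward from the center, in both directions, for an empty line.
        center = (start + stop) // 2
        for offset in range(stop - start):
            for line in (center - offset, center + offset):
                if start <= line < stop and is_empty(line):
                    return line
        return None

    def divide(top, left, bottom, right):
        box = trim(top, left, bottom, right)
        if box is None:
            return                      # empty rectangle: discard it
        top, left, bottom, right = box
        row = nearest_empty(top, bottom,
                            lambda r: not any(labels[r][left:right]))
        col = nearest_empty(left, right,
                            lambda c: not any(labels[r][c] for r in range(top, bottom)))
        if row is None and col is None:
            rects.append((top, left, bottom - top, right - left))
            return
        row_spans = [(top, bottom)] if row is None else [(top, row), (row + 1, bottom)]
        col_spans = [(left, right)] if col is None else [(left, col), (col + 1, right)]
        for r0, r1 in row_spans:
            for c0, c1 in col_spans:
                divide(r0, c0, r1, c1)

    if labels and labels[0]:
        divide(0, 0, len(labels), len(labels[0]))
    return rects


labels = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
rects = bounding_rectangles(labels)
```

As in the description above, no tree data structure is built. Only the list of rectangles is kept, and a dividing line is accepted only where it crosses no non-zero pixel, so no object is ever cut.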

Figure 3.10: Two objects are bounded by a quadtree algorithm. 


The implementation of the quadtree used is designed specifically for OASIC to be very fast, 
even when working with very large images, as long as they are sparsely populated. Once the 
foreground has been decomposed into a number of rectangles of varying sizes, all empty 
rectangles are deleted, and each remaining rectangle is left enclosing one or more groups of 
foreground pixels. For each rectangle, the upper-left coordinates are stored along with the 
dimensions. 

Each object-bearing rectangle, henceforth referred to as a sub-image, must still be compressed. 
Early experimentation showed that compressing each sub-image individually quickly grew costly 
because the objects are too small for entropy-based compressors to be efficient. Furthermore, 
each compressed sub-image contained its own header, which was sometimes larger than the 
sub-image itself. To address the inefficiencies of individual sub-image compression, an efficient 
rectangle packing algorithm is employed to combine all sub-images into a single foreground 
composite rectangle. 

3.4.4 Efficient Rectangle Packing 

Efficient rectangle packing permits the merging of all foreground sub-images into one large 
rectangular image with minimal gaps. 

Using a derivative of the method described by Korf [23], an arbitrary number of axis-aligned 
rectangular sub-images of differing sizes can be packed quickly and efficiently. Figure 3.11 
displays a packed rectangle with the largest vessels placed first and the smaller vessels used 
to fill in any gaps. Figure 3.12 shows the same algorithm applied to an image containing nearly 
600 detected vessels. Once assembled, the composite foreground rectangle is compressed as 
the new pseudo-foreground along with the coordinates of the sub-images within both the packed 
rectangle and the foreground image. The sub-image dimensions are also stored. Each sub-image 
requires 12 bytes of overhead to store its coordinates and dimensions. 
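
One plausible way to arrive at 12 bytes per sub-image is six unsigned 16-bit integers: two coordinates in the foreground image, two coordinates in the packed rectangle, and the width and height. The thesis does not state the field layout, so the layout, field order and names in the following Python sketch are assumptions made purely for illustration.

```python
import struct

# Hypothetical record layout: six little-endian unsigned 16-bit integers.
RECORD = struct.Struct("<6H")


def pack_sub_image(src_x, src_y, packed_x, packed_y, width, height):
    """Serialize one sub-image's placement metadata into a 12-byte record."""
    return RECORD.pack(src_x, src_y, packed_x, packed_y, width, height)


record = pack_sub_image(1200, 340, 0, 0, 48, 17)

# Metadata overhead for an image with 600 detected sub-images.
overhead_600 = 600 * RECORD.size
```

Under this assumed layout, the metadata for an image with 600 sub-images comes to 7200 bytes. A 16-bit field limits coordinates to 65535, so larger images would need wider fields.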

The shape of the rectangle is as close to a square as possible to equalize the number of pixels in 
both the horizontal and vertical axes. Many lossless compression algorithms such as JPEG2000 
take advantage of spatial repetition. This method, by virtue of the packing algorithm, maximizes 
this repetition along both horizontal and vertical axes and permits better lossless compression. 
Early experimentation indicated the improvement in efficiency to be relatively minor, especially 
with large images. 



Figure 3.11: Example of efficient packing of a few sub-images. 

Figure 3.12: Example of efficient packing of nearly 600 sub-images. 


3.4.5 Object Dilation 

Detecting large vessels with features extracted with the Discrete Wavelet Transform can be 
problematic as their internal areas often have little contrast and tend to not stimulate a large 
enough response from the DWT for the Support Vector Machine to recognize them as ship- 
pixels. This shortfall leaves large gaps within the detected area of a vessel as shown in Figure 
3.13(a). 

A simple and effective solution is to designate a radius around each detected pixel as ship-pixels, 
even if they were not originally classified as such. The morphological preprocessing operation 
used to achieve this result is called dilation [27, p. 490] and is computationally inexpensive. 
Dilation causes more of the surrounding ocean to be preserved, and can fill hollow spaces and 
close gaps within larger objects. Dilation, however, increases the number of false positive pixels. 
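
Binary dilation with a given radius can be sketched in a few lines of Python. The disk-shaped neighborhood is an assumption made for this illustration, since the shape of the structuring element is not specified here, and the function name dilate is hypothetical.

```python
def dilate(labels, radius):
    """Mark every pixel within `radius` of a detected pixel as a ship pixel.

    A disk-shaped neighborhood is assumed; pixels outside the image are ignored.
    """
    rows, cols = len(labels), len(labels[0])
    offsets = [(dr, dc)
               for dr in range(-radius, radius + 1)
               for dc in range(-radius, radius + 1)
               if dr * dr + dc * dc <= radius * radius]
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                for dr, dc in offsets:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        out[rr][cc] = 1
    return out


labels = [[0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0],
          [0, 0, 1, 0, 0],
          [0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0]]
dilated = dilate(labels, 1)
```

A single detected pixel grows into a small cross at radius 1. Two detections separated by a gap narrower than twice the radius merge into one region, which is how dilation closes hollow areas inside large vessels.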

3.4.6 Object Preprocessing Solutions 

More dilation generally means more of the vessel is detected. However, with 4 pixel dilation, 
or even 8 pixel dilation, gaps still exist for the largest ships as shown in Figure 3.13(b) and 
(c) respectively. OASIC supports two additional preprocessing solutions that attempt to better 
enclose the vessel using the ship-pixels that are detected. 



Figure 3.13: a. No dilation b. 4-pixel dilation c. 8-pixel dilation d. Solid Rectangle e. Filled Object 
The foreground is indicated by the lighter gray and the background by darker blue. 

Solid Rectangle 

A simple and effective solution is to enclose the entire cluster of pixels within a bounding 
rectangle as shown in Figure 3.13(d). The entire bounding rectangle of the sub-image is 
captured and added to the foreground. 

Filled Object 

The Filled Object method is a preprocessing operation designed for use with OASIC. It bears 
some resemblance to Smart Snakes by Cootes [28] but differs in implementation. Filled Object 
uses orthogonal rays cast from the top, bottom, left and right edges of the sub-image to fill in the 
object. The rays terminate once encountering a ship pixel. When all rays have been terminated, 
any pixels not traversed by a ray are classified as ship pixels as shown in Figure 3.13(e). This 
method works best if some dilation is used first. 
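
The ray-casting rule of the Filled Object method can be sketched in Python as follows. This is an interpretation of the description above, not the OASIC source, and the function name fill_object is hypothetical.

```python
def fill_object(mask):
    """Fill a sub-image by casting rays inward from all four edges.

    Each ray marks the pixels it crosses and stops at the first ship pixel.
    Any pixel never crossed by a ray is classified as a ship pixel.
    """
    rows, cols = len(mask), len(mask[0])
    traversed = [[False] * cols for _ in range(rows)]

    for r in range(rows):
        for order in (range(cols), reversed(range(cols))):   # from left, from right
            for c in order:
                if mask[r][c]:
                    break
                traversed[r][c] = True
    for c in range(cols):
        for order in (range(rows), reversed(range(rows))):   # from top, from bottom
            for r in order:
                if mask[r][c]:
                    break
                traversed[r][c] = True

    return [[0 if traversed[r][c] else 1 for c in range(cols)] for r in range(rows)]


outline = [[0, 1, 1, 1, 0],
           [1, 0, 0, 0, 1],
           [1, 0, 0, 0, 1],
           [1, 0, 0, 0, 1],
           [0, 1, 1, 1, 0]]
filled = fill_object(outline)
```

In this example only the outline of the object was detected. No ray reaches the interior, so it is filled, while the four corners are reached from the edges and stay background. A gap in the outline would let rays leak inside, which is consistent with the remark that the method works best after some dilation.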

3.5 Selective Compression 

Most compression schemes work by taking advantage of the inherent redundancy found in an 
image. OASIC, however, takes advantage of the relative sparsity of ships present within the 
ocean. Only the detected ships are preserved by the lossless compression of the foreground, 
while the ocean is distorted by the extremely lossy background compression. 




Figure 3.14: Images with suppression (top) suffer from less noise than those without suppression 
(bottom), where ringing artifacts are more prominent. Images with the foreground (gray) disabled 
(right) show that suppression removes some distortion from the background (blue). (Bottom 
panel titles: No Suppression, Foreground Overlay; No Suppression, No Foreground.) 

3.5.1 Object Suppression 

Although the background is already compressed with a lossy algorithm configured with a low 
fidelity, and achieves a tremendous reduction in size, there is one further optimization: the 
objects that have been detected and compressed as the foreground are still present in the 
background. Removing them from the background reduces the information that must be stored, 
since they are already stored in the foreground at a higher fidelity. This step also eliminates 
the ringing artifacts around the object that extend beyond the original object’s boundaries as 
shown in Figures 3.14(a) and (c). 

The detected objects are removed from the background layer by replacing their pixels with a 
content-aware gradient of pixel shades as shown in Figure 3.14(b). Suppression of background 
objects not only improves compression but improves the fidelity of partially detected ships as 
undetected internal areas are not corrupted by the ringing artifacts caused by the unnecessary 
compression of the ship in the background. This corruption is shown in Figure 3.14(c) and (d). 


3.5.2 JPEG 

JPEG, defined by ITU-T T.81 and ISO/IEC 10918-1 [29] is a lossy compression format with 
an adjustable fidelity that encodes an image with a discrete cosine transform (DCT), quantizing 
the products and achieving an impressive compression ratio. OASIC evaluates this method’s 
performance as a background compressor. 


3.5.3 JPEG 2000 

JPEG2000 is defined by ITU-T T.800 and ISO/IEC 15444-1 [20] and functions in both lossy 
and lossless modes. 

Lossy JPEG2000 encodes an image in much the same way as the detector stage of OASIC, in 
that it decomposes an image into a pyramid using the Discrete Wavelet Transform (DWT). Like 
JPEG, it too has a configurable fidelity. Due to its superior method of storing the coefficient 
products of the DWT, JPEG2000 achieves a much better compression ratio with far better 
quality than regular JPEG. 

OASIC evaluates both of this method’s modes, using lossless for its foreground compression 
and lossy for its background compression. The actual file container format used by both OASIC 
and for comparison with OASIC is the JP2 minimal JPEG2000 format [20]. 

3.5.4 PNG 

The PNG standard is defined by ISO/IEC 15948 [30]. This is another lossless file compression 
format that functions very similarly to the GIF file format it was intended to replace. OASIC 
evaluates this method as well for use in compressing its foreground. 

CHAPTER 4: 
Experimentation 


The evaluation of OASIC’s algorithms is a multi-part problem. A large testing set of annotated 
oceanic satellite imagery must be evaluated for detection with different configurations and 
compression compared to both lossy and lossless algorithms. 


4.1 Equipment and Software 

Computer Specs 

All testing is performed on one system with an AMD Athlon™ 64 X2 Dual Core CPU at 
2.6 GHz with 4.00 GB of RAM running Windows 7 64-bit Home Premium. 

Implementation 

OASIC’s algorithms are written in MathWorks MATLAB (R2012b) due to the ease of processing 
large amounts of data in matrix form. MATLAB also natively provides support for 
configuring, saving and loading exotic image formats such as lossless JPEG and JPEG2000. 
The only non-standard MATLAB toolbox used is LIBSVM for the Support Vector Machine. 


4.2 Testing Performance 

The performance of OASIC is evaluated at different stages: The Detection Stage’s Wavelet 
Pyramid configuration and dilation/preprocessing options are tested and the Compression Stage’s 
performance is compared to both JPEG2000’s lossless and lossy modes. 

4.2.1 Image Annotation 

Just as with the training image discussed in Chapter 3, all satellite images to be tested are first 
annotated. Because OASIC is foremost a compression algorithm, and not a detection algorithm, 
vessels are not distinguished from each other in the Annotation Label Matrix supplied with each 
satellite image. Annotation is done on a per-pixel level, with a 1 corresponding to a ship-pixel 
in the source image and a 0 corresponding to a non-ship pixel. Red pixels represent ship pixels 
in Figure 4.2. 





4.2.2 Detection Evaluation 


Judging whether a ship has been successfully detected is not necessarily a straightforward 
problem. Simple methods such as enclosing both true and detected ships in a bounding box 
and measuring their area of overlap are quick but dependent on the orientation of the ships. 
Vessels at diagonal orientation do not fit efficiently within rectangles and can impact testing 
accuracy. 

OASIC uses a per-pixel evaluation by comparing every detected pixel to every annotated pixel 
and calculating the percentage of detected pixels for the entire image and for ship clusters. 
This analysis can be efficiently performed by using Equation 4.1 to determine the percentage 
of detection for the entire image. Note that this evaluation method is more precise and hence 
stricter than the rectangle overlap method or other common methods used to evaluate detectors. 

Once an OASIC compressed image is uncompressed, a copy of the original predicted label 
matrix is derived from its foreground layer D. To convert these values back into binary values, 
the mathematical sign function is used. The result sgn(D) can be thought of as a bit mask, and 
when applied to the Annotation Label Matrix R using the logical AND operator, the only pixels 
remaining are true positive pixels. By converting these pixels into binary values using the 
mathematical sign function, summing them and then dividing the sum by the total ship pixels, 
the True Positive Pixel detection rate (Tp) is calculated. In the equation, m and n correspond to 
the image dimensions in pixels, and i and j are their indices. 


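$$T_p = \frac{\sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \operatorname{sgn}\big(\operatorname{sgn}(D[i,j]) \wedge R[i,j]\big)}{\sum_{i=0}^{m-1} \sum_{j=0}^{n-1} R[i,j]} \times 100 \tag{4.1}$$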
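
The calculation described for Equation 4.1 can be sketched in Python as follows, assuming D holds non-negative foreground pixel values and R holds 0/1 annotation labels. The function names sgn and true_positive_rate are chosen for this illustration.

```python
def sgn(x):
    """Mathematical sign: 1 for positive, 0 for zero, -1 for negative."""
    return (x > 0) - (x < 0)


def true_positive_rate(D, R):
    """Percentage of annotated ship pixels that are present in the foreground layer.

    D: decompressed foreground layer (0 where nothing was detected).
    R: Annotation Label Matrix (1 = ship pixel, 0 = non-ship pixel).
    """
    hits = sum(sgn(sgn(d) & r)
               for d_row, r_row in zip(D, R)
               for d, r in zip(d_row, r_row))
    total = sum(r for r_row in R for r in r_row)
    if total == 0:
        raise ValueError("annotation contains no ship pixels")
    return 100.0 * hits / total


D = [[0, 200],
     [17, 0]]
R = [[0, 1],
     [1, 1]]
rate = true_positive_rate(D, R)
```

Here two of the three annotated ship pixels appear in the foreground, giving a rate of roughly 66.7%. A ship pixel that is pure black in the foreground would count as undetected under this formula.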


The result is the True Positive Pixel detection rate for the entire image. As mentioned before, 
the Annotation Label Matrix does not distinguish individual ships from one another. Therefore, 
in order to gather per-ship cluster statistics, the Annotation Label Matrix must be broken up into 
localized ship clusters (shown in red in Figure 4.1). Fortunately, the quadtree algorithm discussed 
in Chapter 3 is perfect for this task. 

Once the percentage of detected ship pixels within a localized ship cluster is calculated, the 
performance of the detector can be further broken down: Any percentage below 50% is considered 
a failure to detect the ship. The number of detections at 50%, 75% and full 100% are calculated 
and graphed. 
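
The per-cluster breakdown can be sketched in Python as follows. The thresholds are taken from the text; the function name detection_counts and the dictionary keys are hypothetical.

```python
def detection_counts(cluster_rates):
    """Count ship clusters by detection level.

    cluster_rates: per-cluster true positive pixel percentages (0 to 100).
    A cluster below 50% counts as a failure to detect the ship.
    """
    return {
        "missed": sum(rate < 50 for rate in cluster_rates),
        "50_or_more": sum(rate >= 50 for rate in cluster_rates),
        "75_or_more": sum(rate >= 75 for rate in cluster_rates),
        "full": sum(rate >= 100 for rate in cluster_rates),
    }


counts = detection_counts([10.0, 50.0, 80.0, 100.0])
```

The three detection levels are cumulative, so a fully detected cluster is also counted at the 50% and 75% levels.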

Figure 4.1: Image with the ship clusters enclosed in rectangles (red). 


4.2.3 Detector Configuration 

Pixel dilation, pyramid octaves and object preprocessing options are all configurable and all 
affect detection efficiency. 

Pyramid performance with different numbers of octaves is tested to determine the best number 
of octaves to use for detection. All octaves beyond the first are scaled using nearest-neighbor 
interpolation, but the results of using bicubic interpolation are tested as well. 

Five preprocessing configurations are analyzed: No dilation, dilation with a 4-pixel radius, and 
dilation with an 8-pixel radius. The Solid Rectangle and Filled Object (using 4-pixel dilation) 
methods are also tested. 

The results of the detection experiments are presented as Receiver Operating Characteristic 
(ROC) curves, which are well suited to spotting trends in the relationship between True Positive 
Pixels and False Positive Pixels. ROC curves are produced for the different pyramid 
configurations, the different dilation options, and the Solid Rectangle and Filled Object methods. 

4.2.4 Comparing Compression Ratios 

The compression ratio (C/R) is defined as the original uncompressed image size divided by the 
compressed file size as shown in Equation 4.2. The value of a compression ratio R is expressed 
as R:1 (pronounced "R to 1"). 

$$C/R = \frac{\text{uncompressed image size}}{\text{compressed file size}} \tag{4.2}$$



To calculate the compressed file sizes, both PNG and lossless JPEG2000 are considered. 

The lossless PNG format produces larger files than the lossless JPEG2000 algorithm in all of the 
10 large satellite images tested (an average of 17% larger). Similarly, the lossy JPEG format 
introduces approximately 8% more noise to an image than its lossy JPEG2000 counterpart for 
the same file size. For these reasons, comparisons are made only using lossless JPEG2000 for 
the foreground layer, and lossy JPEG2000 for the background layer. Comparing OASIC to 
JPEG2000 in lossless mode is done by simply calculating the compression ratios of the two and 
using this comparison as a measure of OASIC’s performance. 

To evaluate OASIC’s performance in ocean imagery compression, the testing set is compressed 
both in OASIC’s OAI format and in JPEG2000’s minimal JP2 format. Because OASIC gains its 
efficiency by taking advantage of the relative sparsity of ships at sea compared to the ocean 
and masked terrain, small image chips will perform less favorably when compared to full scale 
satellite images. For this reason, full sized satellite images are evaluated to test performance by 
compressing them with the OASIC algorithm with the optimum pyramid configuration, dilation 
and preprocessing options. 

The images are also compressed with the minimal lossy JPEG2000 format (JP2) to within five 
kilobytes of their OASIC counterpart’s file size. The noise produced by both algorithms’ lossy 
compression is evaluated to compare fidelity. 

OASIC’s lossy background layer’s compression ratio is fixed at 500:1. Therefore, the theoretical 
maximum compression ratio for any OASIC file is 500:1, giving a file 1/500th the uncompressed 
size. (This extreme case occurs only when no ships are present.) 
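
The effect of the fixed 500:1 background ratio on the overall compression ratio can be illustrated with a short Python sketch. It ignores container and metadata overhead, which is a simplifying assumption, and the function names and example sizes are hypothetical.

```python
def compression_ratio(uncompressed_bytes, compressed_bytes):
    """Compression ratio R, expressed as R:1."""
    return uncompressed_bytes / compressed_bytes


def oasic_ratio(uncompressed_bytes, foreground_bytes, background_ratio=500):
    """Overall ratio when the background is compressed at a fixed ratio.

    foreground_bytes is the size of the losslessly compressed foreground.
    Container and metadata overhead are ignored in this sketch.
    """
    background_bytes = uncompressed_bytes / background_ratio
    return compression_ratio(uncompressed_bytes, background_bytes + foreground_bytes)


no_ships = oasic_ratio(500_000_000, 0)            # upper bound: empty foreground
some_ships = oasic_ratio(500_000_000, 1_000_000)  # 1 MB of lossless foreground
```

With an empty foreground the ratio reaches the 500:1 ceiling. In this made-up example, a lossless foreground as large as the compressed background halves the ratio to 250:1, which shows how strongly the foreground size governs the final result.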

4.2.5 Comparing Fidelity Loss 

All lossy compression algorithms introduce noise; however, OASIC and JPEG2000 distribute 
their noise in completely different fashions. This experiment will confirm that OASIC 
introduces fewer errors to the ship pixels than JPEG2000 does for the same file size. 

JPEG2000’s lossy mode cannot be compared by compression ratio alone because its compression 
ratio is dependent on its fidelity setting. For a comparison by compression ratio to be 
meaningful, both OASIC and lossy JPEG2000 would need a similar level of fidelity. 
This is problematic because it is difficult to match fidelity, or level of noise, between the two 
compression algorithms. However, JPEG2000’s target file size can be precisely set (within 
approximately 5 kilobytes), allowing lossy JPEG2000 compressed files to match compression 
ratios with their OASIC compressed counterparts. The errors (inverse of fidelity) for both files 
are then calculated and compared to measure the performance of both algorithms. 

To evaluate the overall error introduced by the lossy compression, the PSNR (Peak Signal to 
Noise Ratio) must be calculated. This is done by first calculating the MSE (Mean Square Error) 
from the original image I and the lossy compressed image K as shown in Equation 4.3 where 
m and n are the dimensions of the image. Once the MSE M is obtained, the PSNR P can be 
calculated using Equation 4.4 with b as the common bit depth of the images. (All images are 
8-bit for this experiment.) The PSNR is in decibels (dB), with higher values indicating higher 
fidelity of the lossy image, and the lower values indicate worse fidelity. An infinite PSNR 
indicates a lossless image. 
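
The MSE and PSNR calculation can be sketched in Python as follows. The sketch assumes the standard definitions, namely the mean of the squared pixel differences for the MSE and 10 log10((2^b - 1)^2 / M) for the PSNR, which match the symbols I, K, m, n, M and b described above. The function names mse and psnr are chosen for this illustration.

```python
import math


def mse(I, K):
    """Mean square error between the original image I and the lossy image K."""
    m, n = len(I), len(I[0])
    squared = sum((a - b) ** 2
                  for i_row, k_row in zip(I, K)
                  for a, b in zip(i_row, k_row))
    return squared / (m * n)


def psnr(I, K, bits=8):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    M = mse(I, K)
    if M == 0:
        return math.inf
    peak = 2 ** bits - 1          # 255 for 8-bit images
    return 10 * math.log10(peak ** 2 / M)


original = [[0, 255],
            [255, 0]]
lossy = [[0, 254],
         [255, 0]]
value = psnr(original, lossy)
```

A single pixel that is off by one gray level in this 2 x 2 example gives an MSE of 0.25 and a PSNR of about 54 dB, while identical images give an infinite PSNR, consistent with the statement that an infinite PSNR indicates a lossless image.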

The PSNR metric is most useful when comparing the exact same regions of the same images, so 
the entire image is evaluated, and the PSNRs of the individual ships are summed and evaluated 
separately. 

4.2.6 Image Set 

The images used for training the Support Vector Machine are crucial to the performance of 
OASIC. Training images should ideally match the expected circumstances of the image to be 
compressed, if known. Poor weather should warrant a training image with more cloud cover, 
while rough seas should necessitate a training image containing white caps (wave crests that 
appear white from above). If the user or satellite does not have any knowledge of the 
weather or sea state beforehand, a generic image such as the one shown in Figure 4.2 can be 
used for training. 

Training Images 

OASIC can be configured to use any training image; however, the image must be 512 x 512 
pixels. Only one training image was used for all experiments. The image used is depicted in 
Figure 4.2. 



Figure 4.2: A training image with its associated labels (red) showing a mix of large and small vessels 
and clouds. 


Testing Images 

The image test set is comprised of several color images from around the world including heavily 
trafficked ports, open ocean, extreme cloud cover, and sea states from a calm 0 to a tumultuous 
7 on the Beaufort Scale. All images were obtained from commercial satellites and provided by 
Space and Naval Warfare Systems (SPAWAR). 

All images are subsequently converted to panchromatic for testing. For the compression 
experiments, 10 full sized (221.5 to 775.5 megapixels) images are used. 

For ship detection experiments, 25 image chips (1 to 16.8 megapixels) are extracted. This step 
is done for speed considerations yet will have no effect on accuracy so long as the 25 images 
are sufficiently representative of the environments found in the 10 images. 

CHAPTER 5: 
Results 


The performance of OASIC is analyzed according to the criteria described in Chapter 4, 
addressing first ship detection accuracy, then lossless and lossy compression. 

5.1 Ship Detection 

Preliminary analysis has indicated that an optimal pyramid configuration is useful for discerning 
waves and clouds from vessels as they otherwise may confuse the SVM. Determining such a 
configuration is the first experiment. Once the best pyramid configuration is established, all 
subsequent experiments use this configuration. 

5.1.1 Optimum Pyramid Configuration 

Various pyramid configurations are tested on an annotated image containing clouds, masked 
terrain and ships of varying sizes and orientations. For the pyramid configuration tests, no pixel 
dilation or any other preprocessing method is applied to the predicted label matrix. The 
independent variable is the number of pyramid octaves, while the dependent variables are the 
numbers of true positive pixels and false positive pixels (ocean pixels misidentified as ship 
pixels). The results appear in Figure 5.1. This experiment’s results indicate that a three octave 
pyramid is the most accurate, agreeing with previous work by Huang [8], Kiely [9] and Zhu [10]. 

As described in depth in Chapter 3, scaling each pyramid octave to match the dimensions of the 
largest octave provides a speed boost because complicated coordinate transforms are no longer 
needed. Normally, Nearest Neighbor interpolation is used when scaling octaves to precisely 
emulate the slower coordinate transform that it replaces, but bicubic interpolation can be used 
instead as shown in Figure 3.7. Repeating the experiment with this method yields an additional 
10% boost to accuracy as shown in Figure 5.2. 

A 3-octave pyramid scaled with bicubic interpolation is used in the detection stage for all 
subsequent experiments. 

Figure 5.1: With no pixel dilation or octave interpolation, eight pyramid octave combinations are 
tested (panels: 1 to 4 Pyramid Octaves; 5 to 8 Pyramid Octaves). The 3-octave pyramid performs 
the best. 

Figure 5.2: The 3-octave bicubic interpolated pyramid (solid line) provides better performance than 
the standard 3-octave pyramid (dotted line). 

5.1.2 Preprocessing Options 

This experiment tested 25 images containing 444 ship clusters of varying sizes and orientations 
in a wide variety of environments. The tests were done with no dilation, 4-pixel dilation, 
8-pixel dilation, Solid Rectangle and Filled Object, with the results shown in Figures 5.3, 5.4 
and 5.5. The ship cluster detection rates are plotted at the 50% or greater, 75% or greater and 
100% detection intervals. The raw pixel rates are measured and plotted on the same graph as 
well. The independent variable in this test is the dilation or preprocessing method while the 
dependent variables are the true positive pixels and false positive pixels. 

The optimal pixel dilation radius appears to be 8 pixels, as this method yields the highest 
number of detections at only a minor cost to the false positive pixel rate. Pairing 4-pixel dilation 
with the Filled Object preprocessing method produces results very similar to those produced by 
the Solid Rectangle method as shown in Figure 5.5. The Filled Object preprocessing method 
does not appear to perform better than others such as 4-pixel or 8-pixel dilation. 

Figure 5.3: Performance with different preprocessing options: No dilation (top) and 4-pixel dilation 
(bottom). Each panel is an ROC detection curve over 25 images. 

Figure 5.4: Performance with different preprocessing options: 8-pixel dilation (top) and Solid 
Rectangle (bottom). Each panel is an ROC detection curve over 25 images. 

Figure 5.5: Performance of the 4-pixel / Filled Object preprocessing method (ROC detection curve 
over 25 images). 

5.2 Compression Ratios 

The compression ratios of OASIC and lossless JPEG2000 are shown in Figures 5.6 and 5.7 for 
four of the five tested preprocessing methods. The No Dilation method performs poorly and is 
omitted from these charts. The average compression ratio for all four methods is 113:1, which is 
14 times greater than JPEG2000’s lossless compression. The 4-pixel dilation method provides 
the best compression ratio. 

The larger vessels tend to contain hollow voids with only their outlines being detected. Dilation 
fills these voids, improves detection and reduces noise. However, it can undermine compression 
efficiency by adding pixels around smaller vessels that do not suffer from the hollow void 
phenomenon. For this reason, 8-pixel dilation performs poorly, as the number of pixels filled in 
is not proportionate to the number of false positive pixels generated. The false positive pixels 
generated by 8-pixel dilation cause vessels that are close to each other to be merged into 
the same ship cluster, which cannot be divided by the quadtree when decomposing the 
foreground. This causes more non-ship pixels to be stored in the foreground, undermining the 
compression ratio. Solid Rectangle and Filled Object both offer better compression 
performance because they discriminate which ships will gain additional pixels. Both 
can fill the voids within larger vessels with minimal impact on smaller vessels. 

Figure 5.6: Compression Ratio performance of OASIC when compared to lossless JPEG2000 using 10 
satellite images with both 4 and 8 pixel dilations. 

Figure 5.7: Compression Ratio performance of OASIC when compared to lossless JPEG2000 using 10 
satellite images with both Solid Rectangle and Filled Object preprocessing methods. 

5.3 Fidelity Loss 

OASIC’s compression algorithm takes advantage of the sparsity of ship pixels in relation to the 
surrounding ocean. As with all lossy compression algorithms, information must be discarded. 
For files of comparable size, lossy JPEG2000 and OASIC differ in how they distribute the data 
loss as demonstrated in Figure 5.8. To illustrate the data loss, the original uncompressed image 
(left) is subtracted from a lossy JPEG2000 image (center) and from an OASIC compressed 
image (right). The zoomed in areas (inset) indicate the most important difference between the 
two algorithms: how they distribute noise. The ships detected during OASIC compression 
experience much less noise than with lossy JPEG2000 at the same compression ratio. 



Figure 5.8: With JPEG2000 the errors are distributed evenly through the ocean and ships, while with
OASIC the ships remain largely error free. Perfectly detected vessels exhibit no error. Errors
occur only when ships contain misclassified (false negative) pixels.


Figures 5.9 and 5.10 display the individual Peak Signal to Noise Ratios (PSNR) for all images. For
OASIC, both the overall PSNR and the PSNR for only the ship clusters are shown. A completely
noiseless image causes the PSNR to approach infinity, so all graphs are limited to a PSNR of
40dB. In all cases except one (4-pixel dilation, Image 10), OASIC has less noise than lossy
JPEG2000 at the same file size.
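
For reference, the PSNR values discussed here follow the standard definition, PSNR = 10 log10(MAX^2 / MSE), with MAX = 255 for 8-bit imagery. The Python sketch below is illustrative only and is not the MATLAB implementation used in this work. The function name `psnr`, the flat-list image representation, and the optional ship mask are assumptions made for the example.

```python
import math

def psnr(original, compressed, mask=None, max_value=255):
    """Return PSNR in dB between two equal-length pixel sequences.

    If mask is given, only pixels whose mask entry is truthy are compared
    (for example, ship-cluster pixels). Returns math.inf when there is no error.
    """
    if len(original) != len(compressed):
        raise ValueError("images must have the same number of pixels")
    if mask is None:
        mask = [True] * len(original)
    pairs = [(o, c) for o, c, m in zip(original, compressed, mask) if m]
    if not pairs:
        raise ValueError("mask selects no pixels")
    mse = sum((o - c) ** 2 for o, c in pairs) / len(pairs)
    if mse == 0:
        return math.inf  # a noiseless region has unbounded PSNR
    return 10 * math.log10(max_value ** 2 / mse)

# Toy 2x3 image flattened row by row; the last two pixels stand in for a ship.
original = [10, 12, 11, 13, 200, 210]
compressed = [12, 10, 13, 11, 200, 210]
ship_mask = [False, False, False, False, True, True]

overall = psnr(original, compressed)                # ocean noise lowers this value
ships_only = psnr(original, compressed, ship_mask)  # ship pixels are untouched
```

In this toy case the ship-only PSNR is infinite because the ship pixels are reproduced exactly, which is why a finite ceiling is needed when plotting.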

Note: Image 10 is nearly entirely obscured by clouds and contains a single ship. The detection
stage classifies over 70% of the image as ship pixels. This phenomenon is called over-detection,
and is caused by heavy clouds and excessive waves.

A side-by-side comparison of an OASIC compressed ship, detected at 85%, and a completely
undetected ship is shown in Figures 5.11 and 5.12. Note that while both the large and small
undetected vessels have lost fine detail, they are still recognizable and are not completely
obscured by the lossy background compression.

Figure 5.9: These graphs show the relative PSNR levels of OASIC compared to lossy JPEG2000 using
10 satellite images with 4-pixel dilation (Top) and 8-pixel dilation (Bottom).

Figure 5.10: These graphs show the relative PSNR levels of OASIC compared to lossy JPEG2000
using 10 satellite images with the Solid Rectangle method (Top) and Filled Object method (Bottom).

[Image panels: Undetected Ship (left), 85% Detected Ship (right)]

Figure 5.11: Large undetected ships (left) suffer from compression-induced noise, and fine details
are lost. Even partially detected ships fare better (right).

[Image panels: Undetected Ship (left), 100% Detected Ship (right)]

Figure 5.12: Undetected smaller ship (left) and a fully detected ship (right).

5.3.1 Best Configuration

Figure 5.13 displays the relationship between average compression ratios and average PSNR
levels. 4-pixel dilation has the most noise due to its relatively low detection rate, despite having
an excellent compression ratio. Conversely, 8-pixel dilation has the lowest noise, but its
compression ratio was the lowest. The Solid Rectangle method performed the best in terms of total
image noise overall. The Filled Object method, however, achieved the second highest PSNR
for the ship clusters at an average of 41.5dB and also has a C/R of 117:1, making it the best
preprocessing method.

Note: The infinite PSNR values were clipped to 75dB for calculation of the average. 
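
The clipping described in this note can be sketched as follows. This is an illustrative Python example rather than the code used to produce the results. The function name `average_psnr` and the sample values are hypothetical, and only the 75dB ceiling comes from the text.

```python
import math

def average_psnr(values_db, ceiling_db=75.0):
    """Average PSNR values after clipping each one to ceiling_db."""
    clipped = [min(v, ceiling_db) for v in values_db]
    return sum(clipped) / len(clipped)

# Hypothetical per-image ship PSNR values; inf marks perfectly preserved ships.
sample = [38.0, math.inf, 42.0]
mean_db = average_psnr(sample)  # computed as (38 + 75 + 42) / 3
```

Without the ceiling, a single noiseless image would make the average infinite and hide the differences between configurations.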

[Figure 5.13 plot, titled "OASIC to JPEG2000 Comparisons". Horizontal axis: Average Compression
Ratio (C/R), 0 to 250. Legend: 0-, 4-, and 8-pixel Dilation, Solid, Filled, and JP2K, each shown
for the whole image ("Image") and for the ship clusters ("Ships").]

Figure 5.13: The performance of all five preprocessing methods is graphed for both the entire image
(blue) and ships only (red). Higher PSNR and higher compression ratios indicate better performance.
The best configuration is the Filled Object method.

5.4 Future Research

Research on OASIC has not concluded, and a wide range of potential improvements remains.

5.4.1 Code Optimization 

OASIC’s implementation in MATLAB does not take full advantage of the capabilities MATLAB
provides; many repetitive tasks could be accomplished faster by using MATLAB’s powerful
matrix processing operations. In order to eventually use OASIC aboard a satellite as intended,
other languages should be examined, as well as different platforms such as digital signal
processors (DSPs) and field-programmable gate arrays (FPGAs). The OASIC algorithm takes
about 50-90 minutes to compress and store each of the full resolution images. This time varies
greatly due to three factors: how many pixels are examined, how many pyramid octaves are
used, and how many ships are detected. Preprocessing options have a lesser effect.
Improvements can be made by streamlining the repetitive operations present in both the
detection and compression stages.

5.4.2 Automatic Configuration 

When OASIC’s detector erroneously classifies ocean waves as ships, the number of detections
skyrockets. The SVM detection results of each 512 x 512 tile could be analyzed for this
condition and, if necessary, the sensitivity reduced and the tile recomputed. Each tile could be
analyzed in this way, perhaps adjusting the pyramid configuration as well. Lossy foreground
compression could also be evaluated for further compression ratio gains.
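
One possible shape for such a per-tile check is sketched below in Python. It is a proposal-level illustration, not part of OASIC. The function `detect_with_backoff`, the 50% over-detection threshold, the 0.8 back-off factor, and the stand-in detector `toy_detect` are all assumptions made for the example, and the real SVM detector would take their place.

```python
def detect_with_backoff(tile, detect, sensitivity=1.0,
                        max_ship_fraction=0.5, step=0.8, max_retries=5):
    """Re-run a detector at reduced sensitivity while a tile looks over-detected.

    detect(tile, sensitivity) must return a flat list of booleans
    (True = ship pixel). Returns (mask, sensitivity_used).
    """
    mask = detect(tile, sensitivity)
    for _ in range(max_retries):
        fraction = sum(mask) / len(mask)
        if fraction <= max_ship_fraction:
            break  # the tile no longer looks over-detected
        sensitivity *= step
        mask = detect(tile, sensitivity)
    return mask, sensitivity

# Stand-in detector for demonstration only: a pixel counts as "ship" when its
# value exceeds a threshold that rises as the sensitivity falls.
def toy_detect(tile, sensitivity):
    threshold = 100 / sensitivity
    return [pixel > threshold for pixel in tile]

# Bright waves plus two much brighter ship pixels (hypothetical values).
tile = [120, 130, 110, 122, 240, 250, 115, 105]
mask, used = detect_with_backoff(tile, toy_detect)
```

At full sensitivity every pixel in this toy tile is flagged, so the sketch lowers the sensitivity once and accepts the second, sparser result.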

5.4.3 Testing and Training Image Set 

OASIC trains on only a single image; future research could determine the effects of multiple
training images, including rough seas and heavy cloud cover, both environments that caused
over-detection. OASIC has been tested only on 8-bit panchromatic images; future research could
focus on the use of SAR imagery and on multispectral and hyperspectral images with more than
8 bits per channel. Testing was limited to 10 high resolution images, and future research could
test on many more to better refine performance results.

5.4.4 Other Ship Detectors

OASIC’s detection stage is not compared to other ship detectors. Different feature extraction
and classification methods may perform better than the DWT and SVM implementation used
by OASIC and could permit vast improvements to compression. Future research could focus
on comparing current ship detectors to OASIC and on what effect adopting better detectors would
have on compression performance.

5.4.5 Digital Nautical Charts 

The entire image pre-processing step can be automated with the aid of vector-based Digital
Nautical Charts (DNC). This would require terrain landmarks to be identified and the appropriate
DNC to be rectified (rotated, scaled and adjusted for distortion) before being overlaid on the image.

CHAPTER 6:
Conclusions 


6.1 Capabilities 

The results of the analysis indicate that OASIC does in fact validate the concept of
Content-Aware Adaptive Compression of Satellite Imagery Using Artificial Vision. It outperforms
the lossless JPEG2000 format’s compression ratio with acceptable loss in fidelity, and it
outperforms the lossy JPEG2000 format in fidelity for a file of equal size and compression ratio.

6.1.1 Ship Detection 

In 10 images containing a total of 3014 ship clusters, OASIC’s best preprocessing configuration
was Filled Object with 4-pixel dilation. This configuration detected 2947 ships
above 50% for a ship detection rate of 84%. Of the nearly 7 million ship pixels in the entire
image testing set, OASIC successfully classified 5 million, for a total ship pixel detection rate of
72%.

While successful, OASIC also produced a total of 1.4 billion false positive pixels out of
approximately 6 billion pixels total. This accounted for 99.8% of the pixels detected. A majority
of these false positive pixels are from three large images (6, 9 and 10) that suffered from
over-detection, in which nearly all pixels in the images were classified as ship pixels.
Disregarding the outliers, the false positive rate drops to 76%, which is still more than two
false positive pixels for every true pixel detected.

Despite the high volume of false positive pixels, the overall compression ratio and PSNR of the
images were still very high, or at a minimum matched JPEG2000. This is because the false
positive pixels tended to be clustered around the ships and not scattered throughout. Many of
the false positive pixels near the ships are captured in the same rectangle that would enclose the
ship anyway, and therefore incur a minimal loss of compression efficiency, if any.

The efficiency of the Solid Rectangle method is mostly dependent on the orientation of the
vessels it encloses and is very inefficient for large ships at diagonal angles. While it guarantees
all ship pixels within its bounds are preserved in the foreground, it does not perform as well as
the Filled Object method in preserving ships with minimal noise.

The Filled Object method can provide the highest detection rates when processing multiple types
of vessels of different sizes. This method performs better than Solid Rectangle, and has the
most potential for improving the detection rate while having a minimal negative impact on
compression ratios.

6.1.2 Image Compression and Fidelity 

Compared to lossless JPEG2000, OASIC typically achieved a 17 to 1 compression advantage
while maintaining an average PSNR above 35dB (nearly flawless).

OASIC’s PSNR fares much better than intuition might dictate, but there is an explanation: just
because a pixel is not detected does not mean it is lost. The lossy background compressor may
distort the undetected ship pixel values, but the lower their frequency, the less distortion they
will sustain. Fortunately, most of the high frequency pixel clusters (which would suffer the most
if not detected) happen to be the pixel clusters most likely to be detected.

Suppression of detected objects in the background contributes to OASIC’s high PSNR. The
lossy JPEG2000 algorithm produces intense ringing artifacts, especially around pixel clusters
of high frequency, such as ships. Because OASIC suppresses the majority of the ships in the lossy
background, these artifacts are generally suppressed as well. Figure 3.14 demonstrates this
best when comparing the pier in (a) and (b) versus (c) and (d).

6.1.3 Summary 

In all tests, OASIC’s worst performance equaled that of lossless JPEG2000. Should the OASIC
algorithm be implemented on an imaging satellite, the benefit would be a significant reduction
in required channel capacity and time to download an image from space.

Vessels at sea would benefit from this improvement the most: Maritime Domain Awareness,
anti-piracy operations, law enforcement at sea and other operations at sea would all benefit
from getting satellite-borne intelligence into the hands of the operator faster. Vessels with
smaller antennas, such as submarines and patrol craft, would greatly benefit from OASIC. In
the case of submarines, fine-detailed OASIC-compressed satellite imagery of the surrounding
ocean could be downloaded quickly, reducing the time the submarine must spend on the surface
to access the satellite.

Satellites using OASIC could be engineered to have even higher spatial resolutions and multiple
spectral bands with less concern about ever-increasing power and mass requirements.

APPENDIX A:
OAI File 


Files compressed by OASIC are stored in an OAI file, which begins with a 3-byte header:

[01] Storage Method:
’S’ - Bounding Rectangle
’L’ - Two-Layer

[02] Foreground Compression:
’2’ - JPEG2000
’J’ - JPEG
’P’ - PNG

[03] Background Compression:
’N’ - NONE (No Background)
’2’ - JPEG2000
’J’ - JPEG
’P’ - PNG
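
As an illustration of how the 3-byte header might be read, the Python sketch below maps each byte to the names listed above. It is not the OASIC reader itself. The function name `parse_oai_header` is hypothetical, and the sketch assumes each code is stored as a single ASCII character.

```python
# Code tables taken from the header description; ASCII storage is assumed.
STORAGE = {"S": "Bounding Rectangle", "L": "Two-Layer"}
FOREGROUND = {"2": "JPEG2000", "J": "JPEG", "P": "PNG"}
BACKGROUND = {"N": "NONE (No Background)", "2": "JPEG2000", "J": "JPEG", "P": "PNG"}

def parse_oai_header(data):
    """Decode the first three bytes of an OAI file into readable names."""
    if len(data) < 3:
        raise ValueError("an OAI header needs at least 3 bytes")
    storage, fg, bg = (chr(b) for b in data[:3])
    try:
        return {
            "storage": STORAGE[storage],
            "foreground": FOREGROUND[fg],
            "background": BACKGROUND[bg],
        }
    except KeyError as err:
        raise ValueError(f"unknown header code {err}") from None

# Example: Two-Layer storage, JPEG2000 foreground, JPEG background.
header = parse_oai_header(b"L2J")
```

Rejecting unknown codes early keeps a reader from misinterpreting the size fields that follow the header.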


Structure for the Two-Layer method:

[04] FG; 32-bit unsigned Foreground size
[08+FG] Compressed Foreground image
[09+FG] BG; 32-bit unsigned Background size
[13+FG] Compressed Background image

Structure for the Bounding Rectangle method:

[04] N; 16-bit unsigned number of sub-images

In the following six fields x must iterate from 0 to N-1:

[06+x*96] xSrc(x); X coordinate in main image
[08+x*96] ySrc(x); Y coordinate in main image
[10+x*96] xSize(x); X size of sub-image
[12+x*96] ySize(x); Y size of sub-image
[14+x*96] xPos(x); X coordinate of sub-image within packed rectangle
[16+x*96] yPos(x); Y coordinate of sub-image within packed rectangle
[16+N*96] PK; 32-bit unsigned Packed Rectangle size
[20+N*96] Compressed Packed Rectangle
[20+N*96+PK] BG; 32-bit Background size
[24+N*96+PK] Compressed Background Image
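
The sub-image table of the Bounding Rectangle method could be read along the following lines. This Python sketch is a hedged illustration, not the OASIC reader. It assumes little-endian byte order, that N occupies the two bytes immediately after the 3-byte header, and that the six 16-bit fields of each sub-image are packed back to back (12 bytes, or 96 bits, per record). None of these details is confirmed by the listing, and the function name `parse_subimage_table` is hypothetical.

```python
import struct

# Six 16-bit unsigned fields per sub-image; little-endian is an assumption.
RECORD = struct.Struct("<6H")
FIELDS = ("xSrc", "ySrc", "xSize", "ySize", "xPos", "yPos")

def parse_subimage_table(data):
    """Read N and the N sub-image records that follow the 3-byte header.

    Returns (records, offset), where offset is the 0-based position of the
    first byte after the table.
    """
    (count,) = struct.unpack_from("<H", data, 3)  # field [04] is 0-based offset 3
    records = []
    offset = 5  # field [06] is 0-based offset 5
    for _ in range(count):
        values = RECORD.unpack_from(data, offset)
        records.append(dict(zip(FIELDS, values)))
        offset += RECORD.size  # 12 bytes per record
    return records, offset

# Synthetic example: header 'S', 'P', '2' followed by two sub-image records.
blob = b"SP2" + struct.pack("<H", 2)
blob += RECORD.pack(100, 200, 16, 8, 0, 0)
blob += RECORD.pack(300, 50, 4, 4, 16, 0)
records, next_offset = parse_subimage_table(blob)
```

Under these assumptions, the returned offset is where the 32-bit packed rectangle size would begin.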